Abstract. We prove positivity of the Markov operators that correspond
to the hit-and-run algorithm, the random scan Gibbs sampler, the slice
sampler and a Metropolis algorithm with positive proposal. In all of
these cases the positivity is independent of the state space and the
stationary distribution. In particular, the results show that it is
not necessary to consider the lazy versions of these Markov chains.
The proof relies on a well-known lemma which relates the positivity of
the product $MTM^*$, for some operators $M$ and $T$, to the positivity
of $T$. It remains to find a representation of this kind for each
Markov operator, with a positive operator $T$.



1. Introduction 

In many applications, for example volume computation [LS93, KLS97,
LV06b] or integration of functions [LV06a, MN07, Rud09, Rud12], it is
essential that one can approximately sample from a distribution in a
convex body. The dimension $d$ might be very large. One approach that
is feasible for a general class of problems is to run a Markov chain
that has the desired distribution as its limit distribution. In the
following let us explain why the positivity of the Markov operator is
helpful to prove efficiency results for such sampling procedures.

We assume that we have a Markov chain in $K \subset \mathbb{R}^d$ which
is reversible with respect to (w.r.t.) the distribution $\pi$. Let
$L_2(\pi)$ be the space of all functions $f\colon K \to \mathbb{R}$ that
are square integrable w.r.t. $\pi$, and let
$P\colon L_2(\pi) \to L_2(\pi)$ be the corresponding Markov operator.
We assume that $P$ is ergodic, which means that $Pf = f$ implies that
$f$ is constant. Then let $\mathrm{gap}(P) = 1 - \beta$ be the absolute
spectral gap, where $\beta$ denotes the largest absolute value of the
elements of the spectrum of $P$ without 1. In formulas,
$\beta = \sup\{|\alpha| : \alpha \in \mathrm{spec}(P) \setminus \{1\}\}$,
where $\mathrm{spec}(P)$ denotes the spectrum of $P$. For example, a
lower bound for $\mathrm{gap}(P)$ implies an upper bound on the total
variation distance [LS93] and on the mean square error of Markov chain
Monte Carlo algorithms for the approximation of expectations with
respect to $\pi$, see e.g. [Rud12].

Maybe the most successful technique to bound $\mathrm{gap}(P)$ is the
conductance technique [LS88, LS93]. Unfortunately, bounds on the
conductance only yield bounds on the second largest element of the
spectrum of the Markov operator. This is known as Cheeger's inequality
[LS88]. To handle the total variation distance and the absolute
spectral gap it is necessary to consider also the smallest element of
the spectrum, which describes some kind of periodicity of the Markov
chain. Usually, this problem is avoided by considering a lazy version
of the Markov chain. That is, in each step, the Markov chain remains at
the current state with probability 1/2. Such a lazy version induces a
Markov operator with non-negative spectrum, which implies that the
smallest element of the spectrum does not matter. This strategy has
almost no influence on the computational cost of a single step, since,
compared to the overall cost of one step of the chain, one additional
random number is mostly negligible. However, it is desirable to avoid
any slowdown whenever possible.
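The effect of laziness on the spectrum can be seen in the smallest
possible example. The following Python sketch is an illustration that
is not part of the original text: it assumes a two-state chain with
hypothetical jump probabilities `a` and `b` (both in (0, 1]), for which
the eigenvalues of the transition matrix are known in closed form.

```python
from typing import Tuple


def eigenvalues(a: float, b: float) -> Tuple[float, float]:
    """Eigenvalues of P = [[1 - a, a], [b, 1 - b]], namely 1 and 1 - a - b."""
    return 1.0, 1.0 - a - b


def absolute_gap(a: float, b: float) -> float:
    """gap(P) = 1 - |lambda|, where lambda is the eigenvalue other than 1."""
    return 1.0 - abs(eigenvalues(a, b)[1])


def lazy(a: float, b: float) -> Tuple[float, float]:
    """Jump probabilities of the lazy chain (I + P) / 2."""
    return a / 2.0, b / 2.0


if __name__ == "__main__":
    # (1, 1) is the periodic chain; (0.3, 0.2) already has positive spectrum.
    for a, b in [(1.0, 1.0), (0.9, 0.9), (0.3, 0.2)]:
        print((a, b), absolute_gap(a, b), absolute_gap(*lazy(a, b)))
```

For the periodic chain with a = b = 1 the second eigenvalue is -1, so
the absolute spectral gap is 0, while the lazy chain has gap 1. For
a = 0.3 and b = 0.2 the spectrum is already non-negative, and passing to
the lazy chain halves the gap from 0.5 to 0.25. This is the kind of
slowdown that a direct proof of positivity avoids.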

In the following we present a technique to prove that the spectrum of a
Markov operator is non-negative, which is based on a simple and
well-known lemma from functional analysis. This technique was already
successfully applied in a discrete setting to prove positivity (and
comparison results) for the Swendsen-Wang process from statistical
physics, see [Ull12a, Ull12b]. Here, we show that the hit-and-run
algorithm, the random scan Gibbs sampler, the slice sampler and the
Metropolis algorithm with positive proposal are positive. In
particular, this implies that the independent Metropolis algorithm is
positive. The result is new for the hit-and-run algorithm and the
Metropolis algorithm with positive proposal, whereas for the random
scan Gibbs sampler and the slice sampler it is known [LWK95, MT02].

2. The procedure 

We consider a time-homogeneous Markov chain $(X_i)_{i \in \mathbb{N}}$,
where the $X_i$ are random variables on a common probability space
$(\Omega, \mathcal{F}, \mathbb{P})$ that map into $\mathbb{R}^d$,
equipped with the Borel $\sigma$-algebra $\mathcal{B}$, and satisfy the
Markov property. Namely,

\[ \mathbb{P}(X_n \in A_n \mid X_{n-1} \in A_{n-1}, \dots, X_0 \in A_0)
   = \mathbb{P}(X_n \in A_n \mid X_{n-1} \in A_{n-1}) \]

for all $n \ge 1$ and any sequence of $\mathcal{B}$-measurable sets
$A_0, A_1, \dots$ with
$\mathbb{P}(X_{n-1} \in A_{n-1}, \dots, X_0 \in A_0) > 0$. We assume
that the Markov chain has a unique stationary distribution $\pi$ and
that it is reversible with respect to this measure. For a more
comprehensive introduction to the theory of Markov chains we refer to
[MT09, RR04].

To every Markov chain $(X_i)_{i \in \mathbb{N}}$ corresponds a Markov
kernel $P\colon \mathbb{R}^d \times \mathcal{B} \to [0,1]$ such that,
for each $x \in \mathbb{R}^d$, $P(x,\cdot)$ is a probability measure on
$\mathcal{B}$ and, for each $A \in \mathcal{B}$, $P(\cdot,A)$ is
$\mathcal{B}$-measurable. This Markov kernel is given by

\[ P(x,A) = \mathbb{P}(X_{n+1} \in A \mid X_n = x), \qquad
   x \in \mathbb{R}^d,\ A \in \mathcal{B},\ n \in \mathbb{N}, \]

and describes the probability that the Markov chain reaches the set A 
in one step from x. Using this Markov kernel we define the Markov 
operator P (for notational convenience we use the same letter as for 
the Markov kernel) by 

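\[ Pf(x) = \int_{\mathbb{R}^d} f(y)\,P(x,dy), \qquad x \in \mathbb{R}^d, \]

for all functions $f \in L_2 = L_2(\pi)$, where $L_2$ is the Hilbert
space of functions $f\colon \mathbb{R}^d \to \mathbb{R}$ with inner
product

\[ \langle f, g \rangle = \int_{\mathbb{R}^d} f(x)\,g(x)\,d\pi(x). \]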

By reversibility of the Markov chain we know that $P$ is a self-adjoint
operator on $L_2$. A self-adjoint operator $P$ is called positive,
written $P \ge 0$, if

\[ \langle Pf, f \rangle \ge 0 \quad \text{for all } f \in L_2. \]

It is well known that positive operators have only non-negative
spectrum; for further details see, for example, [Kre89].

Our aim is to show that several Markov chains that are used to sample
from distributions in $\mathbb{R}^d$ induce positive Markov operators.
In this case, we say that the Markov chain is positive. Our main tool
is the following lemma.

Lemma 1. Let $H_1$ and $H_2$ be Hilbert spaces and
$M\colon H_2 \to H_1$ be a bounded, linear operator. Let
$M^*\colon H_1 \to H_2$ be the adjoint operator of $M$ and let
$T\colon H_2 \to H_2$ be a bounded, linear and positive operator. Then
$MTM^*\colon H_1 \to H_1$ is also positive.

Proof. We denote the inner product of $H_i$ by
$\langle\cdot,\cdot\rangle_i$ for $i = 1,2$. By the definition of the
adjoint operator and positivity of $T$, for every $f \in H_1$,

\[ \langle MTM^*f, f \rangle_1 = \langle TM^*f, M^*f \rangle_2 \ge 0. \]

This proves the statement. □
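In finite dimensions the lemma can be checked numerically. The
following Python sketch is only an illustration under assumptions that
are not in the text: the Hilbert spaces are Euclidean spaces, the
operators are random matrices, and the positive operator is built as
$A^T A$ for a random matrix $A$. All function names are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def build_mtm(n1: int, n2: int, seed: int = 0) -> Matrix:
    """Return M T M^T with an n1 x n2 matrix M and a positive n2 x n2 matrix T."""
    rng = random.Random(seed)
    m = random_matrix(n1, n2, rng)  # n1 x n2, so that M T M^T acts on R^{n1}
    a = random_matrix(n2, n2, rng)
    t = matmul(transpose(a), a)     # A^T A is symmetric and positive semidefinite
    return matmul(matmul(m, t), transpose(m))


def quadratic_form(p: Matrix, f: List[float]) -> float:
    """Compute <P f, f> for the Euclidean inner product."""
    n = len(f)
    return sum(f[i] * sum(p[i][j] * f[j] for j in range(n)) for i in range(n))


if __name__ == "__main__":
    p = build_mtm(3, 5, seed=1)
    rng = random.Random(2)
    values = [quadratic_form(p, [rng.uniform(-1, 1) for _ in range(3)])
              for _ in range(5)]
    print(values)  # all entries should be non-negative up to rounding
```

The check mirrors the proof: the quadratic form of $MTM^T$ at $f$ equals
the quadratic form of $T$ at $M^T f$, which is non-negative.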

Suppose we have an operator $P\colon H_1 \to H_1$ on a Hilbert space
with the property that it can be written as $P = MTM^*$, where
$T\colon H_2 \to H_2$, $M\colon H_2 \to H_1$ and $M^*$ is the adjoint of
$M$, for some (other) Hilbert space $H_2$. If we can show,
additionally, that $T$ is a positive operator, we obtain by the lemma
above that $P$ is also positive. Thus, the proof of positivity of the
Markov chains under consideration is done by constructing a suitable
second Hilbert space such that the corresponding Markov operator can be
written in the above mentioned form.

3. Applications

Throughout this section we consider Markov chains in a bounded subset
$K$ of $\mathbb{R}^d$ with non-empty interior. Additionally, we denote
by $B_d$ the $d$-dimensional unit ball and by $S^{d-1}$ its boundary.
Let $\rho\colon K \to [0,\infty)$ be a (not necessarily normalized)
density, i.e. a non-negative Lebesgue-integrable function. We define
the measure $\pi$ with density $\rho$ by

\[ \pi(A) = \frac{\int_A \rho(x)\,dx}{\int_K \rho(x)\,dx} \]

for all measurable sets $A \subset K$. For example, if
$\rho(x) = \mathbf{1}_K(x)$ then $\pi$ is simply the uniform
distribution on $K$. In what follows we present some Markov chains that
can be used to sample approximately from $\pi$, that is, $\pi$ is their
stationary distribution. We will see that each of them is positive,
independently of the choice of the density $\rho$.

We will define only the Markov operators for the corresponding Markov 
chains, since the corresponding Markov kernel can be obtained by ap- 
plying the operators to indicator functions. 



3.1. Hit-and-run. The hit-and-run algorithm consists of two steps:
Starting from $x \in K$, choose a random direction $\theta \in S^{d-1}$
and then choose the next state of the Markov chain with respect to the
density $\rho$ restricted to the chord determined by $x$ and $\theta$.

For $x \in K$ and $\theta \in S^{d-1}$ we denote by $L(x,\theta)$ the
chord in $K$ through $x$ and $x + \theta$, i.e.

\[ L(x,\theta) = \{ x + s\theta \in K \mid s \in \mathbb{R} \}. \]
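One step of the algorithm can be sketched in Python for a special case
in which the chord is available in closed form. This is a minimal
illustration, not the general algorithm: it assumes that $K$ is the
unit ball and that the density is constant on $K$, so that sampling on
the chord is a uniform draw. The function names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def random_direction(d: int, rng: random.Random) -> List[float]:
    """Uniform direction on the sphere S^{d-1} via a normalized Gaussian vector."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def chord_in_unit_ball(x: List[float], theta: List[float]) -> Tuple[float, float]:
    """Return (s_min, s_max) such that x + s * theta lies in the unit ball."""
    b = sum(xi * ti for xi, ti in zip(x, theta))
    c = sum(xi * xi for xi in x) - 1.0  # non-positive for x inside the ball
    root = math.sqrt(max(b * b - c, 0.0))
    return -b - root, -b + root  # roots of s^2 + 2 b s + c = 0


def hit_and_run_step(x: List[float], rng: random.Random) -> List[float]:
    """One hit-and-run step for the uniform distribution on the unit ball."""
    theta = random_direction(len(x), rng)
    s_lo, s_hi = chord_in_unit_ball(x, theta)
    s = rng.uniform(s_lo, s_hi)  # constant density: uniform on the chord
    return [xi + s * ti for xi, ti in zip(x, theta)]


if __name__ == "__main__":
    rng = random.Random(0)
    state = [0.0, 0.0, 0.0]
    for _ in range(5):
        state = hit_and_run_step(state, rng)
        print(state)
```

For a non-constant density the uniform draw on the chord would have to
be replaced by a one-dimensional sampler for the density restricted to
the chord, which is not shown here.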

Additionally we write $\kappa_d$ for the volume of the
$(d-1)$-dimensional unit sphere and

\[ \ell(x,\theta) = \int_{L(x,\theta)} \rho(y)\,dy \tag{1} \]

for the total weight of the chord $L(x,\theta)$. The Markov operator
$H$ that corresponds to the hit-and-run chain is defined by

\[ Hf(x) = \frac{1}{\kappa_d} \int_{S^{d-1}} \frac{1}{\ell(x,\theta)}
   \int_{L(x,\theta)} f(y)\,\rho(y)\,dy\,d\theta \]

for all $f \in L_2(\pi)$. To rewrite $H$ in the desired form let $\mu$
be the product measure of $\pi$ and the uniform distribution on
$S^{d-1}$, and let $L_2(\mu)$ be the Hilbert space of functions
$g\colon K \times S^{d-1} \to \mathbb{R}$ with inner product

\[ \langle g_1, g_2 \rangle_\mu = \frac{1}{\kappa_d} \int_K \int_{S^{d-1}}
   g_1(x,\theta)\,g_2(x,\theta)\,d\theta\,d\pi(x) \]

for $g_1, g_2 \in L_2(\mu)$. We define the operators
$M\colon L_2(\mu) \to L_2(\pi)$ and $T\colon L_2(\mu) \to L_2(\mu)$ by

\[ Mg(x) = \frac{1}{\kappa_d} \int_{S^{d-1}} g(x,\theta)\,d\theta \]

and

\[ Tg(x,\theta) = \frac{1}{\ell(x,\theta)} \int_{L(x,\theta)}
   g(y,\theta)\,\rho(y)\,dy. \]

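Recall that the adjoint operator of $M$ is the unique operator $M^*$
that satisfies $\langle f, Mg \rangle = \langle M^*f, g \rangle_\mu$ for
all $f \in L_2(\pi)$, $g \in L_2(\mu)$, see [Kre89, Thm. 3.9-2]. Since

\[ \langle f, Mg \rangle = \frac{1}{\kappa_d} \int_K \int_{S^{d-1}}
   f(x)\,g(x,\theta)\,d\theta\,d\pi(x), \]

we obtain that, for all $\theta \in S^{d-1}$ and $x \in K$,

\[ M^*f(x,\theta) = f(x). \]

This implies

\[ MTM^*f(x) = \frac{1}{\kappa_d} \int_{S^{d-1}} \frac{1}{\ell(x,\theta)}
   \int_{L(x,\theta)} f(y)\,\rho(y)\,dy\,d\theta = Hf(x) \]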

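and thus, that $M$ and $T$ are the desired "building blocks" for
Lemma 1. First of all, note that by Fubini's theorem the operator $T$
is self-adjoint in $L_2(\mu)$. It remains to show that $T$ is positive.
We know that $L(x,\theta) = L(y,\theta)$ for all $y \in L(x,\theta)$,
and hence also $\ell(x,\theta) = \ell(y,\theta)$. It follows that

\[ \begin{aligned}
T^2 g(x,\theta)
 &= \frac{1}{\ell(x,\theta)} \int_{L(x,\theta)} Tg(y,\theta)\,\rho(y)\,dy \\
 &= \frac{1}{\ell(x,\theta)} \int_{L(x,\theta)} \frac{1}{\ell(y,\theta)}
    \int_{L(y,\theta)} g(z,\theta)\,\rho(z)\,dz\;\rho(y)\,dy \\
 &= \frac{1}{\ell(x,\theta)^2} \int_{L(x,\theta)} g(z,\theta)\,\rho(z)\,dz
    \int_{L(x,\theta)} \rho(y)\,dy \\
 &= Tg(x,\theta).
\end{aligned} \]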

Thus, $T$ is a self-adjoint and idempotent operator on $L_2(\mu)$,
which implies that $T$ is a projection and, in particular, that it is
positive, see e.g. [Kre89, Thm. 9.5-2]. Finally, Lemma 1 shows that $H$
is positive.
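The idempotence argument has a simple finite analogue that can be
checked numerically. The following Python sketch is an illustration
under assumptions that are not part of the text: the state space is a
finite index set, the chords are replaced by the blocks of a partition
(the hypothetical argument `blocks`), and `rho` is a list of positive
weights.

```python
from typing import List, Sequence


def block_average(g: Sequence[float], rho: Sequence[float],
                  blocks: Sequence[Sequence[int]]) -> List[float]:
    """Replace g on each block by its rho-weighted mean over that block."""
    out = [0.0] * len(g)
    for block in blocks:
        weight = sum(rho[i] for i in block)               # analogue of l(x, theta)
        mean = sum(g[i] * rho[i] for i in block) / weight
        for i in block:
            out[i] = mean
    return out


def weighted_inner(f: Sequence[float], g: Sequence[float],
                   rho: Sequence[float]) -> float:
    """Inner product of f and g with respect to the normalized weights rho."""
    return sum(a * b * w for a, b, w in zip(f, g, rho)) / sum(rho)


if __name__ == "__main__":
    rho = [1.0, 2.0, 0.5, 1.5, 0.7, 3.0]
    g = [0.3, -1.0, 2.0, 0.5, -0.2, 1.1]
    blocks = [[0, 1, 2], [3, 4], [5]]
    tg = block_average(g, rho, blocks)
    print(tg)
    print(block_average(tg, rho, blocks))   # agrees with tg up to rounding
    print(weighted_inner(tg, g, rho))       # non-negative
```

Applying the averaging twice changes nothing, because the mean is
constant on each block, and the quadratic form is a weighted sum of
squared block means.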



3.2. Gibbs sampler. The Gibbs sampler, or more specifically the random
scan Gibbs sampler, is conceptually very similar to the hit-and-run
algorithm. In each step, we choose a direction and sample with respect
to $\rho$ restricted to the chord in this direction. But, in contrast
to hit-and-run, we choose the direction from the $d$ possible
directions of the coordinate axes.

Let $e_1 = (1,0,\dots,0), \dots, e_d = (0,\dots,0,1)$ be the Euclidean
standard basis in $\mathbb{R}^d$ and $\ell(\cdot,\cdot)$ be from (1).
The Markov operator $G$ of the Gibbs sampler is given by

\[ Gf(x) = \frac{1}{d} \sum_{j=1}^{d} \frac{1}{\ell(x,e_j)}
   \int_{L(x,e_j)} f(y)\,\rho(y)\,dy \]

for all $f \in L_2(\pi)$. We follow almost the same lines as for the
hit-and-run chain. Let $m$ be the product measure of $\pi$ and the
uniform distribution on $[d] = \{1,\dots,d\}$. By $L_2(m)$ we denote
the Hilbert space of functions $g\colon K \times [d] \to \mathbb{R}$
equipped with the inner product

\[ \langle g_1, g_2 \rangle_m = \frac{1}{d} \sum_{j=1}^{d} \int_K
   g_1(x,j)\,g_2(x,j)\,d\pi(x) \]

for $g_1, g_2 \in L_2(m)$. We define the operators
$M\colon L_2(m) \to L_2(\pi)$ and $T\colon L_2(m) \to L_2(m)$ by

\[ Mg(x) = \frac{1}{d} \sum_{j=1}^{d} g(x,j) \]

and

\[ Tg(x,j) = \frac{1}{\ell(x,e_j)} \int_{L(x,e_j)} g(y,j)\,\rho(y)\,dy. \]

By the same calculations as in Subsection 3.1 we obtain for all
$f \in L_2(\pi)$, $x \in K$ and $j \in [d]$, that $M^*f(x,j) = f(x)$ and
that $G = MTM^*$. It is easily seen that $T$ is self-adjoint and
idempotent. Hence, $T$ is a projection and thus positive, which proves
the assertion by Lemma 1.
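A discrete analogue of the random scan Gibbs sampler makes the
positivity easy to test. The following Python sketch is an
illustration under assumptions that are not in the text: the state
space is a finite two-dimensional grid, `rho` is a table of positive
weights, and the two coordinate directions correspond to the rows and
columns of the grid. The function names are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def gibbs_apply(f: Grid, rho: Grid) -> Grid:
    """Apply the random scan Gibbs operator to f on a grid with weights rho."""
    n, m = len(rho), len(rho[0])
    # Resampling the second coordinate averages f along row i.
    row_avg = [sum(f[i][j] * rho[i][j] for j in range(m)) / sum(rho[i])
               for i in range(n)]
    # Resampling the first coordinate averages f along column j.
    col_avg = [sum(f[i][j] * rho[i][j] for i in range(n))
               / sum(rho[i][j] for i in range(n)) for j in range(m)]
    # Each coordinate is chosen with probability 1/2.
    return [[0.5 * (row_avg[i] + col_avg[j]) for j in range(m)] for i in range(n)]


def inner(f: Grid, g: Grid, rho: Grid) -> float:
    """Inner product of f and g with respect to the normalized weights rho."""
    total = sum(sum(row) for row in rho)
    return sum(f[i][j] * g[i][j] * rho[i][j]
               for i in range(len(rho)) for j in range(len(rho[0]))) / total


if __name__ == "__main__":
    rho = [[1.0, 2.0, 0.5], [0.3, 1.5, 1.0]]
    f = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]
    print(inner(gibbs_apply(f, rho), f, rho))  # non-negative
```

The quadratic form printed at the end is an average of squared row and
column means, which is the discrete counterpart of the statement that
the Gibbs operator is positive.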



3.3. Slice sampler. For any $t \ge 0$ assume that $R_t$ is the
transition kernel of a Markov chain on the level set $K(t)$ of $\rho$,
i.e.

\[ K(t) = \{ x \in K \mid \rho(x) \ge t \}. \]

Further let $R_t$ be reversible with respect to $U_t$, the uniform
distribution on $K(t)$, i.e.

\[ U_t(A) = \frac{\mathrm{vol}_d(A \cap K(t))}{\mathrm{vol}_d(K(t))}, \]

where $\mathrm{vol}_d$ denotes the $d$-dimensional Lebesgue measure.
The slice sampler, starting from a state $x$, works as follows: First
choose a level $t$ uniformly distributed in $[0,\rho(x)]$ and then
sample the next state with respect to $R_t(x,\cdot)$ in the level set
$K(t)$. If $R_t(x,\cdot) = U_t(\cdot)$ then the slice sampler is called
simple slice sampler [MT02]. The corresponding Markov operator is
defined by

\[ Rf(x) = \frac{1}{\rho(x)} \int_0^{\rho(x)} R_t f(x)\,dt
   = \frac{1}{\rho(x)} \int_0^{\rho(x)} \int_{K(t)} f(y)\,R_t(x,dy)\,dt \]

for all $f \in L_2(\pi)$. For any $t \ge 0$ we assume that $R_t$ is a
positive operator on $L_2(U_t)$, which is the set of all real functions
on $K(t)$ that are square integrable with respect to $U_t$, i.e.

\[ \langle R_t f, f \rangle_{U_t} = \int_{K(t)} R_t f(x)\,f(x)\,U_t(dx) \ge 0. \]
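The two steps of the simple slice sampler can be sketched in Python for
a one-dimensional example in which the level sets are intervals. The
choices below are assumptions made only for illustration and do not
come from the text: $K = [-5, 5]$, the unnormalized density is
$\exp(-x^2/2)$, and all names are hypothetical.

```python
import math
import random
from typing import Tuple

BOUND = 5.0  # K = [-BOUND, BOUND]


def rho(x: float) -> float:
    """Unnormalized density on K (hypothetical example)."""
    return math.exp(-0.5 * x * x)


def level_set(t: float) -> Tuple[float, float]:
    """Endpoints of K(t) = {x in K : rho(x) >= t}, an interval for this rho."""
    if t <= 0.0:
        return -BOUND, BOUND
    a = min(BOUND, math.sqrt(max(-2.0 * math.log(t), 0.0)))
    return -a, a


def simple_slice_step(x: float, rng: random.Random) -> Tuple[float, float]:
    """One step: draw the level t, then draw uniformly from K(t)."""
    t = rng.uniform(0.0, rho(x))
    lo, hi = level_set(t)
    return rng.uniform(lo, hi), t


if __name__ == "__main__":
    rng = random.Random(1)
    x = 0.0
    for _ in range(5):
        x, t = simple_slice_step(x, rng)
        print(x, t)
```

Here the second step uses the uniform distribution on the level set
itself. A general slice sampler would replace that draw by one step of
a kernel that is reversible with respect to the uniform distribution on
the level set.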

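To show that $R$ is positive, let

\[ K_\rho = \{ (x,t) \mid x \in K,\ t \in [0,\rho(x)] \} \subset \mathbb{R}^{d+1} \]

and let $\mu$ be the uniform distribution in $K_\rho$. Let $L_2(\mu)$ be
the Hilbert space of functions $g\colon K_\rho \to \mathbb{R}$ with
inner product

\[ \langle g_1, g_2 \rangle_\mu = \int_{K_\rho} g_1(x,t)\,g_2(x,t)\,d\mu(x,t) \]

for $g_1, g_2 \in L_2(\mu)$. We define the operators
$M\colon L_2(\mu) \to L_2(\pi)$ and $T\colon L_2(\mu) \to L_2(\mu)$ by

\[ Mg(x) = \frac{1}{\rho(x)} \int_0^{\rho(x)} g(x,t)\,dt \]

and

\[ Tg(x,t) = \int_{K(t)} g(y,t)\,R_t(x,dy) \]

for $g \in L_2(\mu)$. The adjoint operator
$M^*\colon L_2(\pi) \to L_2(\mu)$ is $M^*f(x,t) = f(x)$. The operator
$T$ is self-adjoint, since $R_t$ is reversible with respect to $U_t$.
For the positivity define $g_t(x) = g(x,t)$ for $(x,t) \in K_\rho$. We
have

\[ \langle Tg, g \rangle_\mu = \int_0^{\|\rho\|_\infty}
   \langle R_t g_t, g_t \rangle_{U_t}\,
   \frac{\mathrm{vol}_d(K(t))}{\mathrm{vol}_{d+1}(K_\rho)}\,dt, \]

where $\|\rho\|_\infty$ denotes the supremum of $\rho$ on $K$, which
implies positivity of $T$ by the positivity of $R_t$. Since
$R = MTM^*$, Lemma 1 shows that $R$ is positive.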
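The identity $R = MTM^*$ can be verified directly from the definitions
of $M$, $T$ and $M^*$. The following lines are a short check that is
not taken from the text; it only inserts those definitions one after
the other.

```latex
\begin{align*}
MTM^*f(x)
  &= \frac{1}{\rho(x)} \int_0^{\rho(x)} T(M^*f)(x,t)\,dt \\
  &= \frac{1}{\rho(x)} \int_0^{\rho(x)} \int_{K(t)} (M^*f)(y,t)\,R_t(x,dy)\,dt \\
  &= \frac{1}{\rho(x)} \int_0^{\rho(x)} \int_{K(t)} f(y)\,R_t(x,dy)\,dt
   = Rf(x).
\end{align*}
```

The last step uses that $M^*f(y,t) = f(y)$ does not depend on the level
$t$.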

3.4. Metropolis algorithm. Assume we have a positive proposal kernel
$B$ which is reversible with respect to $U_0$, the uniform distribution
in $K$. Then the Markov operator of the Metropolis algorithm is given
by

\[ Mf(x) = \int_K f(y)\,\alpha(x,y)\,B(x,dy)
   + \Bigl( 1 - \int_K \alpha(x,y)\,B(x,dy) \Bigr) f(x), \]

where $\alpha(x,y) = \min\{1, \rho(y)/\rho(x)\}$ and $f \in L_2(\pi)$.
We interpret the Metropolis algorithm as a slice sampler. For
$t \ge 0$, $x \in K(t)$ and $A \subset K$ define

\[ R_t(x,A) = B(x, A \cap K(t)) + \bigl( 1 - B(x,K(t)) \bigr)\,\mathbf{1}_A(x). \]

Recall that

\[ Rf(x) = \frac{1}{\rho(x)} \int_0^{\rho(x)} \int_{K(t)} f(y)\,R_t(x,dy)\,dt \]



8 DANIEL RUDOLF AND MARIO ULLRICH 

is the Markov operator of the slice sampler and that $U_t$ is the
uniform distribution in $K(t)$.

Lemma 2. (1) If $B$ is reversible with respect to $U_0$, then $R_t$ is
reversible with respect to $U_t$ for any $t \ge 0$.

(2) If $B$ is positive on $L_2(U_0)$, then $R_t$ is positive on
$L_2(U_t)$ for any $t \ge 0$.

(3) The general slice sampler and the Metropolis algorithm coincide,
i.e. $Rf = Mf$ for $f \in L_2(\pi)$.

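Proof. We have for any $f, g \in L_2(U_t)$ that

\[ \begin{aligned}
\langle R_t f, g \rangle_{U_t}
 &= \int_{K(t)} \int_{K(t)} f(y)\,g(x)\,\mathbf{1}_{K(t)}(y)\,B(x,dy)\,U_t(dx)
    + \int_{K(t)} \bigl( 1 - B(x,K(t)) \bigr) f(x)\,g(x)\,U_t(dx) \\
 &= \langle B(\mathbf{1}_{K(t)} f), \mathbf{1}_{K(t)} g \rangle_{U_0}\,
    \frac{\mathrm{vol}_d(K)}{\mathrm{vol}_d(K(t))}
    + \int_{K(t)} \bigl( 1 - B(x,K(t)) \bigr) f(x)\,g(x)\,U_t(dx).
\end{aligned} \]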

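By using the self-adjointness of $B$ on $L_2(U_0)$ we obtain that $R_t$
is self-adjoint on $L_2(U_t)$ for any $t \ge 0$, which proves (1).
Assertion (2) follows by similar arguments. One obtains

\[ \langle R_t f, f \rangle_{U_t}
   = \langle B(\mathbf{1}_{K(t)} f), \mathbf{1}_{K(t)} f \rangle_{U_0}\,
     \frac{\mathrm{vol}_d(K)}{\mathrm{vol}_d(K(t))}
     + \int_{K(t)} \bigl( 1 - B(x,K(t)) \bigr) f(x)^2\,U_t(dx), \]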

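which, by using the positivity of $B$, proves the positivity of $R_t$.
Note that the Markov operator of the slice sampler can be written as

\[ \begin{aligned}
Rf(x)
 &= \frac{1}{\rho(x)} \int_0^{\rho(x)} \int_{K(t)} f(y)\,B(x,dy)\,dt
    + f(x)\,\frac{1}{\rho(x)} \int_0^{\rho(x)} \mathbf{1}_{K(t)}(x)\,
      \bigl( 1 - B(x,K(t)) \bigr)\,dt \\
 &= \int_K \frac{1}{\rho(x)} \int_0^{\rho(x)} \mathbf{1}_{K(t)}(x)\,
      \mathbf{1}_{K(t)}(y)\,dt\; f(y)\,B(x,dy) \\
 &\quad + f(x) \Bigl( 1 - \int_K \frac{1}{\rho(x)} \int_0^{\rho(x)}
      \mathbf{1}_{K(t)}(x)\,\mathbf{1}_{K(t)}(y)\,dt\; B(x,dy) \Bigr).
\end{aligned} \]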

Then (3) follows by

\[ \frac{1}{\rho(x)} \int_0^{\rho(x)} \mathbf{1}_{K(t)}(x)\,
   \mathbf{1}_{K(t)}(y)\,dt = \alpha(x,y). \]

□



Note that, by the previous lemma, all assumptions of Subsection 3.3
are satisfied if we additionally assume that $B$ is positive. Hence,
the Metropolis algorithm defines a positive Markov operator if the
proposal is positive.
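The link between the acceptance probability and the level integral can
be checked numerically, together with a sketch of the independent
Metropolis algorithm. The following Python code is an illustration
under assumptions that are not in the text: $K = [0, 1]$, the proposal
is the uniform distribution on $K$ regardless of the current state, and
the unnormalized density is the hypothetical choice $1 + x$.

```python
import random


def rho(x: float) -> float:
    """Unnormalized density on K = [0, 1] (hypothetical example)."""
    return 1.0 + x


def alpha(x: float, y: float) -> float:
    """Metropolis acceptance probability min{1, rho(y) / rho(x)}."""
    return min(1.0, rho(y) / rho(x))


def metropolis_step(x: float, rng: random.Random) -> float:
    """One step with a uniform proposal that ignores the current state."""
    y = rng.random()  # proposal drawn uniformly from [0, 1)
    return y if rng.random() < alpha(x, y) else x


def slice_acceptance(x: float, y: float, steps: int = 10000) -> float:
    """Midpoint rule for (1 / rho(x)) * int_0^{rho(x)} 1[rho(y) >= t] dt."""
    h = rho(x) / steps
    hits = sum(1 for k in range(steps) if rho(y) >= (k + 0.5) * h)
    return hits / steps


if __name__ == "__main__":
    print(alpha(0.8, 0.2), slice_acceptance(0.8, 0.2))
    rng = random.Random(3)
    x = 0.5
    for _ in range(5):
        x = metropolis_step(x, rng)
        print(x)
```

The two printed acceptance values agree up to the discretization error
of the midpoint rule, which reflects the identity used in the proof of
Lemma 2 (3).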